Enhancing Reinforcement Learning


Exploiting the Replay Memory Before Exploring the Environment: Enhancing Reinforcement Learning Through Empirical MDP Iteration

Neural Information Processing Systems

Reinforcement learning (RL) algorithms are typically based on optimizing a Markov Decision Process (MDP) via the optimal Bellman equation. Recent studies have revealed that restricting the optimization of Bellman equations to in-sample actions tends to yield more stable optimization, especially in the presence of function approximation. Building on these findings, in this paper we propose an Empirical MDP Iteration (EMIT) framework, which constructs a sequence of empirical MDPs from the replay memory. For each empirical MDP, it learns an estimated Q-function denoted $\widehat{Q}$. The key strength is that, by restricting the Bellman update to in-sample bootstrapping, each empirical MDP converges to a unique optimal $\widehat{Q}$ function.
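The in-sample restriction described above can be illustrated in a tabular setting. The following is a minimal hypothetical sketch (not the paper's actual EMIT implementation): transitions in the replay memory define an empirical MDP, and the max in the Bellman backup ranges only over actions actually observed in each state, rather than over the full action space.

```python
from collections import defaultdict

def in_sample_q_iteration(transitions, gamma=0.99, sweeps=100):
    """Tabular Q-iteration on the empirical MDP induced by the replay memory.

    transitions: list of (s, a, r, s_next) tuples drawn from the replay memory.
    Bootstrapping is restricted to in-sample actions at each next state.
    """
    # Record which actions were actually observed in each state.
    seen_actions = defaultdict(set)
    for s, a, _, _ in transitions:
        seen_actions[s].add(a)

    Q = defaultdict(float)  # estimated Q-function, zero-initialized
    for _ in range(sweeps):
        for s, a, r, s2 in transitions:
            if seen_actions[s2]:
                # Max only over actions seen at s2 (in-sample bootstrapping).
                target = r + gamma * max(Q[(s2, a2)] for a2 in seen_actions[s2])
            else:
                # s2 has no observed actions: terminal in the empirical MDP.
                target = r
            Q[(s, a)] = target
    return Q
```

Because the backup never queries actions absent from the data, the fixed point is determined entirely by observed transitions, which is the property the abstract credits for convergence to a unique optimal estimated Q-function.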


Enhancing Reinforcement Learning in Sensor Fusion: A Comparative Analysis of Cubature and Sampling-based Integration Methods for Rover Search Planning

Ewers, Jan-Hendrik, Swinton, Sarah, Anderson, David, McGookin, Euan, Thomson, Douglas

arXiv.org Artificial Intelligence

This study investigates the computational speed and accuracy of two numerical integration methods, cubature and sampling-based, for integrating an integrand over a 2D polygon. Using a group of rovers searching the Martian surface with a limited sensor footprint as a test bed, the relative error and computation time are compared as the area is subdivided to improve the accuracy of the sampling-based approach. The results show that the sampling-based approach exhibits a $14.75\%$ deviation in relative error compared to cubature when matched to its computational performance at $100\%$. Furthermore, achieving a relative error below $1\%$ necessitates a $10000\%$ increase in computation time due to the $\mathcal{O}(N^2)$ complexity of the sampling-based method. It is concluded that for enhancing reinforcement learning capabilities and other high-iteration algorithms, the cubature method is preferred over the sampling-based method.
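To make the cubature-versus-sampling comparison concrete, here is a small hypothetical sketch on a single triangle (a polygon would be triangulated first). It contrasts a fixed three-point cubature rule with a uniform Monte Carlo estimate; the specific rule and function names are illustrative, not the paper's implementation.

```python
import random

def tri_area(p0, p1, p2):
    """Area of a triangle from its vertices via the cross product."""
    return abs((p1[0]-p0[0])*(p2[1]-p0[1]) - (p2[0]-p0[0])*(p1[1]-p0[1])) / 2.0

def cubature_triangle(f, p0, p1, p2):
    """Three-point edge-midpoint cubature rule, exact for degree-2 polynomials."""
    mids = [((p0[0]+p1[0])/2, (p0[1]+p1[1])/2),
            ((p1[0]+p2[0])/2, (p1[1]+p2[1])/2),
            ((p2[0]+p0[0])/2, (p2[1]+p0[1])/2)]
    return tri_area(p0, p1, p2) * sum(f(x, y) for x, y in mids) / 3.0

def sample_triangle(f, p0, p1, p2, n=10000, seed=0):
    """Uniform Monte Carlo estimate over the same triangle (n evaluations)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u, v = rng.random(), rng.random()
        if u + v > 1.0:           # reflect the point back inside the triangle
            u, v = 1.0 - u, 1.0 - v
        x = p0[0] + u*(p1[0]-p0[0]) + v*(p2[0]-p0[0])
        y = p0[1] + u*(p1[1]-p0[1]) + v*(p2[1]-p0[1])
        total += f(x, y)
    return tri_area(p0, p1, p2) * total / n
```

The cubature rule reaches its accuracy with three integrand evaluations per triangle, whereas the sampling estimate needs many evaluations (and its error shrinks only slowly with more samples), which mirrors the cost gap the abstract reports for high-iteration use cases.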


Enhancing Reinforcement Learning with discrete interfaces to learn the Dyck Language

Dietz, Florian, Klakow, Dietrich

arXiv.org Artificial Intelligence

Even though most interfaces in the real world are discrete, no efficient way yet exists to train neural networks to make use of them. We enhance an Interaction Network (a Reinforcement Learning architecture) with discrete interfaces and train it on the generalized Dyck language. Solving this task requires an understanding of hierarchical structures, and it has long proven difficult for neural networks. We provide the first solution based on learning to use discrete data structures. We encountered unexpected anomalous behavior during training and used pre-training based on execution traces to overcome it. The resulting model is very small and fast, and generalizes to sequences an entire order of magnitude longer than the training data.
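For reference, membership in the generalized Dyck language (balanced brackets of several types) is decided exactly by a stack, which is the kind of discrete data structure such interfaces can expose. A minimal sketch, with an assumed three-type bracket alphabet:

```python
def is_dyck(s, pairs=None):
    """Decide membership in the generalized Dyck language with an explicit stack.

    pairs maps each opening bracket to its closing bracket; the default
    three-type alphabet is an illustrative assumption.
    """
    if pairs is None:
        pairs = {'(': ')', '[': ']', '{': '}'}
    closers = {close: open_ for open_, close in pairs.items()}
    stack = []
    for ch in s:
        if ch in pairs:
            stack.append(ch)          # push the opener
        elif ch in closers:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != closers[ch]:
                return False
        else:
            return False              # symbol outside the alphabet
    return not stack                  # every opener must be matched
```

The hierarchical difficulty the abstract mentions is visible here: correctness hinges on last-in-first-out matching at arbitrary nesting depth, which a fixed-size feedforward network cannot track but a learned stack interface can.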